A sequential method of random allocation is given and it is
shown how it can be used to estimate the observed significance
levels of k-sample nonparametric tests. The sequential technique
is compared to the standard random allocation technique and
shown to be more efficient. An application is made to the
Dunn-Bonferroni method of multiple comparisons.
SIGNIFICANCE AND EXPLANATION


A  major  problem  in  using  nonparametric  tests  is  to  determine 
the  observed  level  of  significance.  This  paper  gives  an  efficient 
sequential  simulation  method  for  doing  this. 


The  responsibility  for  the  wording  and  views  expressed  in  this 
descriptive  summary  lies  with  MRC,  and  not  with  the  author  of 
this  report. 


A SEQUENTIAL k-GROUP RANDOM ALLOCATION METHOD
WITH APPLICATIONS TO SIMULATION


Andrew P. Soms

1.  The  Sequential  Allocation  Method 

Bebbington (1975) showed that if there were N objects (such
as file cards) from which it was desired to select (without
replacement here and throughout) a random sample of size k without
numbering the N objects, then one could proceed sequentially by
selecting the first object with probability k/N and, if at the T-th
stage s have been selected, then the (T+1)-st object is selected with
probability (k-s)/(N-T), T = 1,2,...,N-1.

We now state and prove the extension to an arbitrary number
of groups. Suppose there are N objects and it is desired to
sequentially divide them randomly into r groups of size
$k_1, k_2, \ldots, k_r$, $\sum_{i=1}^{r} k_i = N$, i.e., each allocation
has probability $1 / \binom{N}{k_1, \ldots, k_r}$. Let
$s_{1T}, \ldots, s_{rT}$ be the number of objects selected for groups
1,2,...,r at the T-th stage (with $s_{i0} = 0$) and let $P_{i,T+1}$
denote the selection probability for group i at the (T+1)-st stage.
Then if

$$P_{i,T+1} = (k_i - s_{iT})/(N - T), \quad T = 0,1,\ldots,N-1, \qquad (1.1)$$

the selection is random. Note that $P_{i,1} = k_i/N$ and
$\sum_{i=1}^{r} P_{i,T+1} = 1$. The randomness follows immediately by
noting that the probability of a particular assignment is

$$\frac{k_1! \, k_2! \cdots k_r!}{N!} = 1 \Big/ \binom{N}{k_1, \ldots, k_r},$$

since the denominators of (1.1) run through N, N-1, ..., 1 and the
numerators for group i run through $k_i, k_i - 1, \ldots, 1$.

Bebbington's (1975) result is a special case of the above
when r = 2.
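The equal-probability property can also be checked by brute force for a small case. The following Python sketch is an illustration only, not part of the original report; the function name `assignment_probability` is hypothetical. It multiplies the stage probabilities of (1.1) in exact rational arithmetic for every admissible assignment with group sizes 2, 2, 3.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def assignment_probability(assignment, sizes):
    """Probability that rule (1.1) yields `assignment` (0-based group labels)."""
    n = len(assignment)
    selected = [0] * len(sizes)      # s_{iT}: objects already placed in group i
    prob = Fraction(1)
    for stage, group in enumerate(assignment):   # stage plays the role of T
        prob *= Fraction(sizes[group] - selected[group], n - stage)
        selected[group] += 1
    return prob


sizes = (2, 2, 3)                    # illustrative group sizes k_1, k_2, k_3
n = sum(sizes)

# Keep only label sequences that put exactly k_i objects into group i.
valid = [a for a in product(range(len(sizes)), repeat=n)
         if all(a.count(g) == k for g, k in enumerate(sizes))]

# Every admissible assignment should carry the same probability.
probs = {assignment_probability(a, sizes) for a in valid}

multinomial = factorial(n)
for k in sizes:
    multinomial //= factorial(k)
```

Under these assumptions `probs` contains a single value, the reciprocal of the multinomial coefficient, which is what the product argument predicts.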
As an example, suppose r = 3, $k_1 = 2$, $k_2 = 2$, $k_3 = 3$ and N = 7.
In order to make the sequential allocation given by (1.1) we take
7 independent random numbers $U_1, \ldots, U_7$. Let $Q_{0,T} = 0$ and
$Q_{i,T} = \sum_{j=1}^{i} P_{j,T}$, i = 1,2,...,r, T = 1,2,...,N.
Then the m-th object, m = 1,2,...,N, is assigned to group n, where
n is the unique integer such that

$$Q_{n-1,m} < U_m \le Q_{n,m}.$$

Suppose the 7 random numbers are .79039, .01850, .99744, .81812,
.93169, .22705, and .97709. The selection process is summarized
in Table 1.

Table 1. Selection Process

Stage   Random Number   P_{1T}   P_{2T}   P_{3T}   Group Selected
  1        .79039        2/7      2/7      3/7           3
  2        .01850        2/6      2/6      2/6           1
  3        .99744        1/5      2/5      2/5           3
  4        .81812        1/4      2/4      1/4           3
  5        .93169        1/3      2/3      0             2
  6        .22705        1/2      1/2      0             1
  7        .97709        0        1        0             2

Note that if all the $k_i$'s are one, a random permutation is
produced if we think of the group as denoting position.
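The worked example can be replayed in a few lines. This is a minimal Python sketch, not the report's FORTRAN; the name `sequential_allocate` is hypothetical, and the guard against already-full groups is an added safety detail for the boundary case U = 0.

```python
from fractions import Fraction


def sequential_allocate(sizes, uniforms):
    """Return 1-based group labels chosen by rule (1.1), one per object."""
    n_objects = sum(sizes)
    if len(uniforms) != n_objects:
        raise ValueError("need exactly one random number per object")
    selected = [0] * len(sizes)                  # s_{iT}
    labels = []
    for stage, u in enumerate(uniforms):         # stage plays the role of T
        remaining = n_objects - stage            # N - T
        cumulative = Fraction(0)                 # Q_{i,T+1}, built up group by group
        chosen = None
        for i, k in enumerate(sizes):
            cumulative += Fraction(k - selected[i], remaining)
            # Pick the first open group whose cumulative probability covers u.
            if k > selected[i] and u <= cumulative:
                chosen = i
                break
        if chosen is None:                       # only reachable if u exceeds 1
            chosen = max(i for i, k in enumerate(sizes) if k > selected[i])
        selected[chosen] += 1
        labels.append(chosen + 1)
    return labels


# The seven random numbers quoted in the text, with k = (2, 2, 3).
table_1_uniforms = [.79039, .01850, .99744, .81812, .93169, .22705, .97709]
table_1_groups = sequential_allocate((2, 2, 3), table_1_uniforms)

# With every group size equal to one, the labels form a permutation.
permutation = sequential_allocate((1, 1, 1, 1), [.6, .2, .9, .5])
```

With the quoted random numbers the labels agree with the last column of Table 1.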


2.  Applications  to  Simulation 

In k-sample nonparametric tests the observed significance
level of the test is obtained by considering all M possible
partitions of the (possibly tied) observed values or (possibly
average) ranks into r groups, computing the value of the test
statistic for each, and counting the number of times m it is equal
to or greater than the observed value. The observed significance
level $\hat{\alpha}$ is then m/M. When the number of partitions is
large this is prohibitive and $\hat{\alpha}$ is estimated either by
simulation (taking a large random sample of the allocations) or by
asymptotics. The advantage of simulation is that one can control the
accuracy of the estimate (by taking a large or small random sample)
depending on the importance of the situation, unlike asymptotics,
which each time it is used forces one into the straitjacket of
committing a usually unknown error. Since it is (perhaps regrettably)
a well known fact that different actions will be taken for close
values of $\hat{\alpha}$, one above and the other below some fixed
level (e.g., .01, .05, or .1) of the decision-maker, the use of
simulation at least prevents approximation error in $\hat{\alpha}$
from being the determining factor.
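To make the accuracy argument concrete, here is a hedged Python sketch of a simulated significance level together with its binomial standard error. The Kruskal-Wallis statistic is an assumption chosen for illustration (the text does not fix a particular k-sample statistic), and the names `sequential_split`, `kruskal_wallis` and `estimate_level` are hypothetical.

```python
import math
import random


def sequential_split(values, sizes, rng):
    """Divide `values` into groups of the given sizes with rule (1.1)."""
    groups = [[] for _ in sizes]
    for stage, value in enumerate(values):
        # Scaling U by the N - T objects left is equivalent to dividing
        # each count of open places by N - T.
        u = rng.random() * (len(values) - stage)
        cumulative = 0
        for i, k in enumerate(sizes):
            cumulative += k - len(groups[i])     # open places in groups 1..i
            if u < cumulative:
                groups[i].append(value)
                break
    return groups


def kruskal_wallis(groups):
    """Kruskal-Wallis H computed from lists of ranks (assumed statistic)."""
    n = sum(len(g) for g in groups)
    total = sum(sum(g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def estimate_level(observed_groups, n_sim, seed=0):
    """Return (estimate of m/M, its binomial standard error)."""
    rng = random.Random(seed)
    sizes = [len(g) for g in observed_groups]
    pooled = [x for g in observed_groups for x in g]
    h_obs = kruskal_wallis(observed_groups)
    # The small tolerance keeps exact ties from being lost to rounding.
    hits = sum(
        kruskal_wallis(sequential_split(pooled, sizes, rng)) >= h_obs - 1e-9
        for _ in range(n_sim)
    )
    p_hat = hits / n_sim
    return p_hat, math.sqrt(p_hat * (1 - p_hat) / n_sim)


p_extreme, se_extreme = estimate_level([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 2000, seed=1)
p_null, se_null = estimate_level([[1, 5, 9], [2, 6, 7], [3, 4, 8]], 200, seed=1)
```

The standard error shrinks like the inverse square root of `n_sim`, which is the sense in which the size of the random sample controls the accuracy of the estimate.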

If it is decided to use simulation, then a possible procedure
is to make the random assignment as described in Section 1 many
times by using a computer. The commonly used method is to produce
a random permutation by ordering a random sample of uniform numbers
and choosing the first $k_1$ indexes for group 1, the next $k_2$
for group 2, and so on. If all the $k_i$'s are one, then this is
more efficient than Section 1. However, as soon as the $k_i$'s
depart even moderately from 1, the method of Section 1 becomes
much more efficient. As an example, if $k_1 = k_2 = k_3 = k_4 = 10$
and it is desired to make 2000 random assignments using a UNIVAC 1110
computer, a FORTRAN program using the method of Section 1 uses 4.71
seconds of CPU time while a FORTRAN program using the random
permutation method takes 9.17 seconds.
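For contrast, the random permutation method described above can be sketched as follows. This is an illustrative Python version with a hypothetical function name, `permutation_allocate`; it makes no claim about timings on modern hardware, only that the method orders all N uniforms, whereas the rule of Section 1 makes a single pass over the objects.

```python
def permutation_allocate(sizes, uniforms):
    """Order the uniforms, then cut the ordered indexes into consecutive groups."""
    # 0-based indexes of the objects, arranged by increasing uniform number.
    order = sorted(range(len(uniforms)), key=uniforms.__getitem__)
    groups, start = [], 0
    for k in sizes:
        # The next k indexes in the ordering form the next group (1-based labels).
        groups.append([index + 1 for index in order[start:start + k]])
        start += k
    return groups


uniforms = [.79039, .01850, .99744, .81812, .93169, .22705, .97709]
groups = permutation_allocate((2, 2, 3), uniforms)
```

Here `groups` lists, for each group, the objects it receives. The same seven uniforms lead to a different allocation than in Table 1 because the two methods use the random numbers differently.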

The Appendix contains a listing of the FORTRAN subroutine
RANDM that uses the theory of Section 1 to make random assignments.
This may be tied in with any specific simulation problem, e.g.,
the case treated in Section 3.


3.  Applications  to  the  Dunn-Bonferroni  Method 
of  Multiple  Comparisons 




The D-B (Dunn-Bonferroni) method is described in Dunn (1964).
Briefly, let $X_{ij}$, i = 1,2,...,r, j = 1,2,...,$n_i$, be continuous
(this assumption is not important and is removed later) random
variables with distribution function $F_i$, $H_0$: $F_1 = F_2 = \cdots = F_r$,
$H_A$: for at least one pair (i,j), $F_i \ne F_j$ (in the sense of
producing larger or smaller values), and the test must identify which,
if any, pairs are different. Denote by $z_\alpha$ the upper $\alpha$-th
point of the standard normal. The D-B test declares all those pairs
(i,j), i < j, different for which

$$z_{ij} = |\bar{r}_i - \bar{r}_j| \Big/ \left[ \frac{N(N+1)}{12} \left( \frac{1}{n_i} + \frac{1}{n_j} \right) \right]^{1/2} > z_{\alpha/(r(r-1))}, \qquad (3.1)$$

where $N = \sum_{i=1}^{r} n_i$ and $\bar{r}_i$ denotes the average of the
ranks of the i-th group in the joint ranking. The nominal significance
level of this procedure is $\alpha$. The actual significance level
$\alpha_A$ is

$$\alpha_A = P_0 \left( \max_{i<j} z_{ij} \ge z_{\alpha/(r(r-1))} \right) \qquad (3.2)$$


and  may  be  obtained  by  simulation  based  on  Section  1.  Table  2 
gives  some  comparisons  of  nominal  with  actual,  using  Section  1  and 
10,000  simulations. 
Table 2. Comparison of Actual to Nominal $\alpha$

 r   Common Group Size   Nominal $\alpha$   Actual $\alpha$
 3           5                 .05               .037
 3          10                 .05               .040
 3          15                 .05               .043
 3          30                 .05               .045
 3           5                 .01               .0030
 3          10                 .01               .0077
 3          15                 .01               .0077
 3          30                 .01
 5           5                 .05               .026
 5          10                 .05               .036
 5           5                 .01               .0030
 5          10                 .01               .0067

The Appendix contains a listing of the program used for Table 2.
It thus appears that D-B is conservative, and we can remove the
conservatism by substituting $d_{(i)}$ for $z_{\alpha/(r(r-1))}$,
where $d_{(i)}$, i = 1,2,...,r(r-1)/2, is the i-th largest observed
value of $z_{ij}$, i < j, to obtain by simulation the r(r-1)/2
possible observed significance levels.
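The simulation behind (3.2) can be sketched in Python as follows. This is not a transcription of the Appendix program: the names `random_rank_sums` and `actual_level` are hypothetical, the normal quantile comes from the standard library rather than being supplied as input, and results vary with the seed and the number of simulations.

```python
import math
import random
from itertools import combinations
from statistics import NormalDist


def random_rank_sums(sizes, rng):
    """Rank sums of the groups after allocating ranks 1..N by rule (1.1)."""
    n = sum(sizes)
    counts = [0] * len(sizes)
    sums = [0] * len(sizes)
    for stage in range(n):
        u = rng.random() * (n - stage)
        cumulative = 0
        for i, k in enumerate(sizes):
            cumulative += k - counts[i]          # open places in groups 1..i
            if u < cumulative:
                counts[i] += 1
                sums[i] += stage + 1             # this object carries rank T + 1
                break
    return sums


def actual_level(sizes, alpha, n_sim, seed=0):
    """Estimate P_0(max z_ij >= z_{alpha/(r(r-1))}) as in (3.2)."""
    r, n = len(sizes), sum(sizes)
    z_crit = NormalDist().inv_cdf(1 - alpha / (r * (r - 1)))
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sim):
        sums = random_rank_sums(sizes, rng)
        means = [s / k for s, k in zip(sums, sizes)]
        z_max = max(
            abs(means[i] - means[j])
            / math.sqrt(n * (n + 1) / 12 * (1 / sizes[i] + 1 / sizes[j]))
            for i, j in combinations(range(r), 2)
        )
        exceed += z_max >= z_crit
    return exceed / n_sim


estimate = actual_level((5, 5, 5), alpha=0.05, n_sim=2000, seed=1)
```

For three groups of size 5 the estimate should fall below the nominal .05, in line with the conservatism reported in Table 2, although the exact figure depends on the random number stream.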
The K-S (Kruskal-Scheffé) method is also sometimes used in
this situation (see, e.g., Miller, 1966, p. 166) and consists of
replacing $z_{\alpha/(r(r-1))}$ in (3.1) with
$h_\alpha^{1/2} = (\chi^2_{\alpha;r-1})^{1/2}$, where
$\chi^2_{\alpha;r-1}$ is the upper $\alpha$-th point of $\chi^2$ with
r-1 degrees of freedom. The comparison of the critical constants in
Table 3 shows that this is even more conservative than D-B.


Table 3. Comparison of D-B and K-S Critical Constants

  r   $z_{.05/(r(r-1))}$   $(\chi^2_{.05;r-1})^{1/2}$   $z_{.01/(r(r-1))}$   $(\chi^2_{.01;r-1})^{1/2}$
  3         2.39                   2.79                       2.94                   3.36
  4         2.50                   3.08                       3.02                   3.65
  5         2.58                   3.33                       3.09                   3.89
  6         2.64                   3.55                       3.15                   4.10
  7         2.69                   3.75                       3.19                   4.30
  8         2.74                   3.94                       3.23                   4.48
  9         2.77                   4.11                       3.26                   4.66
 10         2.81                   4.28                       3.29                   4.82

If the data is discrete, the D-B method can be modified as
in Dunn (1964) and the random assignment done on average ranks.
Thus ties present no problems in this approach.

For all practical purposes the exact D-B (use of the $d_{(i)}$
and simulation) seems the best method to use.
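A hedged Python sketch of the exact D-B idea on data with ties is given below. All function names are hypothetical. The uncorrected denominator of (3.1) is used on average ranks: a tie correction that multiplies every $z_{ij}$ by the same constant would rescale the observed and simulated values alike and so leave the simulated levels unchanged.

```python
import math
import random
from itertools import combinations


def midranks(values):
    """Average ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for idx in order[pos:end + 1]:
            ranks[idx] = (pos + end) / 2 + 1     # mean of 1-based positions
        pos = end + 1
    return ranks


def sequential_split(values, sizes, rng):
    """Divide `values` into groups of the given sizes with rule (1.1)."""
    groups = [[] for _ in sizes]
    for stage, value in enumerate(values):
        u = rng.random() * (len(values) - stage)
        cumulative = 0
        for i, k in enumerate(sizes):
            cumulative += k - len(groups[i])
            if u < cumulative:
                groups[i].append(value)
                break
    return groups


def pairwise_z(groups):
    """z_ij of (3.1) for every pair i < j, from lists of (average) ranks."""
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    return {
        (i, j): abs(means[i] - means[j])
        / math.sqrt(n * (n + 1) / 12 * (1 / len(groups[i]) + 1 / len(groups[j])))
        for i, j in combinations(range(len(groups)), 2)
    }


def exact_db_levels(samples, n_sim=2000, seed=0):
    """List (pair, d, estimated level), ordered from the largest observed z_ij."""
    rng = random.Random(seed)
    sizes = [len(s) for s in samples]
    ranks = midranks([x for s in samples for x in s])
    observed, start = [], 0
    for k in sizes:
        observed.append(ranks[start:start + k])
        start += k
    ordered = sorted(pairwise_z(observed).items(), key=lambda item: -item[1])
    maxima = [max(pairwise_z(sequential_split(ranks, sizes, rng)).values())
              for _ in range(n_sim)]
    # The tolerance keeps exact ties with the observed value from being lost.
    return [(pair, d, sum(m >= d - 1e-12 for m in maxima) / n_sim)
            for pair, d in ordered]


samples = [[1.1, 2.0, 2.0], [5.0, 6.5, 7.2], [9.0, 9.0, 10.4]]   # made-up data
levels = exact_db_levels(samples, n_sim=500, seed=3)
```

Each entry of `levels` pairs an ordered $d_{(i)}$ with the simulated probability that the maximum standardized difference equals or exceeds it, so the levels cannot decrease as one moves down the list.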
Appendix


C     MAIN PROGRAM FOR SEQUENTIAL RANDOM ALLOCATIONS
C     NR IS THE NUMBER OF GROUPS, NSIM THE NUMBER OF SIMULATIONS, K'S THE GROUP
C     SIZES, X'S THE NUMBERS TO BE ALLOCATED
      DIMENSION X(100),Z(20,100),K(20)
  100 FORMAT ( )
   99 READ 100,NR,NSIM
      IF (NR.EQ.0) GO TO 101
      READ 100,(K(J),J=1,NR)
      KK=0
      DO 1 J=1,NR
    1 KK=KK+K(J)
      DO 203 I=1,KK
  203 X(I)=I
      DO 2 I=1,NSIM
      CALL RANDM(NR,K,KK,X,Z)
    2 CONTINUE
      GO TO 99
  101 STOP
      END

      SUBROUTINE RANDM(NR,K,KK,X,Z)
C     NR NUMBER OF GROUPS, K ARRAY OF GROUP SIZES, KK NUMBER OF ELEMENTS, X ARRAY
C     OF KK ELEMENTS, Z RANDOM ALLOCATION OF X
      DIMENSION X(100),Z(20,100),K(20),Q(20),QC(20),NSC(20),U(100)
      DO 3 I2=1,NR
    3 NSC(I2)=0
      DO 333 I2=1,KK
  333 U(I2)=RANUN(X)
      DO 4 I3=1,NR
      QC(I3)=0.
    4 Q(I3)=K(I3)
      MAXI=NR-1
      DO 5 II=1,MAXI
      DO 5 III=1,II
    5 QC(II)=QC(II)+Q(III)
      DO 6 I4=1,KK
      U(I4)=(KK-I4+1.)*U(I4)
      IF (U(I4).LE.QC(1)) GO TO 61
      DO 7 I5=2,MAXI
    7 IF (U(I4).LE.QC(I5)) GO TO 62
      IF (U(I4).GT.QC(NR-1)) INDEX=NR
      GO TO 64
   61 INDEX=1
      GO TO 64
   62 INDEX=I5
   64 NSC(INDEX)=NSC(INDEX)+1
      NN=NSC(INDEX)
      Z(INDEX,NN)=X(I4)
C     UPDATE
      DO 8 I6=1,NR
      QC(I6)=0.
    8 Q(I6)=K(I6)-NSC(I6)
      DO 9 II=1,MAXI
      DO 9 III=1,II
    9 QC(II)=QC(II)+Q(III)
    6 CONTINUE
      RETURN
      END

C     D-B BY SIMULATION
C     NR IS THE NUMBER OF GROUPS, NSIM THE NUMBER OF SIMULATIONS, ZAL IS THE POINT
C     FOR WHICH THE PROBABILITY OF THE MAXIMUM OF ALL THE ABSOLUTE VALUES OF
C     STANDARDIZED RANK AVERAGE DIFFERENCES EQUALLING OR EXCEEDING IT IS TO BE
C     CALCULATED, K'S ARE THE GROUP SIZES
      DIMENSION IFLAG(20,20),R(20)
      DIMENSION X(100),Z(20,100),K(20)
  100 FORMAT ( )
   99 READ 100,NR,NSIM,ZAL
      IF (NR.EQ.0) GO TO 101
      MAXI=NR-1
      READ 100,(K(J),J=1,NR)
      KK=0
      DO 1 J=1,NR
    1 KK=KK+K(J)
      DO 202 I=1,KK
  202 X(I)=I
      CON=KK*(KK+1)*ZAL**2
      COUNT=0.
      DO 2 I=1,NSIM
      CALL RANDM(NR,K,KK,X,Z)
      DO 20 J1=1,NR
      R(J1)=0.
      NUPP=K(J1)
      DO 20 J2=1,NUPP
   20 R(J1)=R(J1)+Z(J1,J2)
      DO 21 J1=1,MAXI
      LLIM=J1+1
      DO 21 J2=LLIM,NR
      IFLAG(J1,J2)=0
   21 IF (12*(K(J2)*R(J1)-K(J1)*R(J2))**2.GE.K(J1)*K(J2)*CON*(K(J1)+K(J2
     1))) IFLAG(J1,J2)=1
      IPROD=0
      DO 22 J1=1,MAXI
      LLIM=J1+1
      DO 22 J2=LLIM,NR
   22 IPROD=IPROD+IFLAG(J1,J2)
      IF (IPROD.GE.1) COUNT=COUNT+1
    2 CONTINUE
      PR=COUNT/NSIM
      PRINT 100,(K(I),I=1,NR)
      PRINT 100,ZAL,NSIM,PR
      GO TO 99
  101 STOP
      END